Louisville Civic Assembly

Polis is an open-source wiki-survey platform for rapid, scalable, open-ended feedback. Participants submit short comments, which are sent out semi-randomly to other participants to vote on (by clicking agree, disagree or pass). Polis uses statistical algorithms to find patterns of consensus and opinion groups.

This report looks at the data generated in an engagement run by Math & Democracy (pro bono) in partnership with the University of Kentucky, NPR and The American Assembly (part of Columbia University) in February 2020. The poll asked residents of Louisville, Kentucky to respond to the question:

What do you believe should change in Louisville to make it a better place to live, work and spend time?

Basic statistics

Of the raw data collected, we have:

Participants: 1398
Commenters: 302
Comments: 877
Votes: 124787
Agrees: 66977
Disagrees: 17451
Votes / participant (avg): 89.26
Groups: 2

After removing moderated-out comments and participants who voted on fewer than 7 comments, we have:

Participants: 1164
Commenters: 132
Comments: 573
Votes: 123139
Agrees: 65775
Disagrees: 17237
Votes / participant (avg): 105.78
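
The filtering step described above can be sketched as follows. This is a minimal illustration, not the code used to produce the report: it assumes votes are available as (participant, comment, vote) records with 1 for agree, -1 for disagree and 0 for pass, and that moderated-out comments are given as a set of ids. The names `filter_votes` and `summarize` are hypothetical, and the report does not say whether the 7-vote threshold is applied before or after removing moderated comments; the sketch applies it after.

```python
from collections import Counter

MIN_VOTES = 7  # threshold stated in the report


def filter_votes(votes, moderated_out, min_votes=MIN_VOTES):
    """Drop votes on moderated-out comments, then drop low-activity participants."""
    kept = [v for v in votes if v[1] not in moderated_out]
    # Count remaining votes per participant (assumption: counted after moderation).
    counts = Counter(pid for pid, _, _ in kept)
    return [v for v in kept if counts[v[0]] >= min_votes]


def summarize(votes):
    """Compute the same kind of summary figures shown in the tables above."""
    participants = {pid for pid, _, _ in votes}
    return {
        "participants": len(participants),
        "votes": len(votes),
        "agrees": sum(1 for _, _, v in votes if v == 1),
        "disagrees": sum(1 for _, _, v in votes if v == -1),
        "votes_per_participant": len(votes) / len(participants) if participants else 0.0,
    }


# Toy data: p1 votes on 8 comments, p2 on only 3; comment 0 was moderated out.
toy_votes = [("p1", c, 1) for c in range(8)] + [("p2", c, -1) for c in range(3)]
filtered = filter_votes(toy_votes, moderated_out={0})
stats = summarize(filtered)
```

In the toy data only p1 survives, since p2 has fewer than 7 votes left once comment 0 is removed.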

Here we can see the distribution of these votes and comments over time as the conversation unfolded.

[Figures: votes and comments over time]

Comment overview

Next, we'll take a look at the variance in the data by plotting comments according to the number of agrees and disagrees. The data are plotted on logarithmic axes because the distribution of vote counts per comment is highly skewed. The grey line separates comments which were predominantly agreed with (bottom right) from those predominantly disagreed with (top left).

[Figure: comments plotted by number of agrees and disagrees, log scale]

Note that comments with far more disagrees than agrees had overall much lower vote counts. This is a direct result of the comment routing architecture of Polis, which deprioritizes comments which most people disagree with.

We can take these votes and arrange them into a matrix, where rows correspond to participants and columns correspond to statements. This allows us to think of participants as having positions in a high dimensional space (dimensionality equal to the number of comments).
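
A minimal sketch of building such a matrix, assuming NumPy is available and the same hypothetical (participant, comment, vote) record layout with 1 / -1 / 0 for agree / disagree / pass. Marking unseen comments as NaN is an assumption of this example, chosen so that "did not vote" stays distinct from "pass".

```python
import numpy as np


def build_vote_matrix(votes):
    """Return a participants x comments matrix; NaN marks comments not voted on."""
    participants = sorted({pid for pid, _, _ in votes})
    comments = sorted({cid for _, cid, _ in votes})
    row = {pid: i for i, pid in enumerate(participants)}
    col = {cid: j for j, cid in enumerate(comments)}
    matrix = np.full((len(participants), len(comments)), np.nan)
    for pid, cid, vote in votes:
        matrix[row[pid], col[cid]] = vote
    return matrix, participants, comments


toy_votes = [("p1", "c1", 1), ("p1", "c2", -1), ("p2", "c1", 0), ("p3", "c2", 1)]
matrix, participants, comments = build_vote_matrix(toy_votes)
```

Each row of `matrix` is one participant's position in a space with one dimension per comment.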

[Figure: vote matrix, participants by comments]

Overall opinion landscape

While the above visualization may be impressive, it's not particularly useful for understanding how participants' opinions relate to each other. To better understand this, we can apply a dimensionality reduction algorithm, which captures as much of the variance in the data as possible within a lower-dimensional space. Specifically, reducing to two dimensions allows us to plot participants' locations in relation to each other in an opinion space, where participants are close together if they tend to agree, and further apart if they tend to disagree. Here, we're also coloring according to a K-means clustering of the participants into opinion groups, which lets us ask questions about what's important to different groups, and better understand the opinion landscape.
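
The two steps can be sketched with NumPy alone. This is an illustrative sketch, not the Polis implementation: the projection uses principal component analysis via a singular value decomposition, and filling missing votes with the column mean as well as the deterministic farthest-point initialization of K-means are assumptions made for this example. The function names are hypothetical.

```python
import numpy as np


def pca_project(matrix, n_components=2):
    """Project participants (rows) onto the top principal components."""
    # Assumption: missing votes (NaN) are replaced by that comment's mean vote.
    # Every comment needs at least one vote for this to be defined.
    col_means = np.nanmean(matrix, axis=0)
    filled = np.where(np.isnan(matrix), col_means, matrix)
    centered = filled - filled.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(points, k=2, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    # Deterministic farthest-point initialization (chosen for reproducibility).
    centers = [points[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy data: two groups of three participants with opposing votes on four comments.
toy = np.array([
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
], dtype=float)
coords = pca_project(toy)
labels = kmeans(coords, k=2)
```

On the toy data the first three and the last three participants land in different groups, mirroring how the coloring in the plot separates opinion groups.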

[Figure: participants projected into two dimensions, colored by opinion group]

Below, we can see the proportion of total variance explained by the x and y axes (the first two principal components) in the plot above:

[0.156 0.029]

The sharp decline in variance explained, from roughly 16% to 3%, suggests a very sharp divide associated with the comments that correspond to position along the X-axis. These comments are far more dominant than the others in predicting participants' positions in the opinion landscape.
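
For reference, the proportion of variance explained by each principal component is its squared singular value divided by the sum of all squared singular values. The sketch below shows this calculation on a small, complete toy matrix (no missing votes, an assumption of this example); the function name is hypothetical. With the figures reported above, the first component explains about 0.156 / 0.029 ≈ 5.4 times as much variance as the second.

```python
import numpy as np


def explained_variance_ratio(matrix):
    """Share of total variance captured by each principal component."""
    centered = matrix - matrix.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Toy data: two opposing groups, so one axis should dominate.
toy = np.array([
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
], dtype=float)
ratio = explained_variance_ratio(toy)
```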

We can also look at this projection of participants alongside a visualization of how each comment contributes to participants' positions in the opinion space. The arrows below represent the direction and magnitude in which a participant will be pushed if they agree with a particular comment.
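
Under a linear projection such as PCA, one way to obtain such arrows is to read them off the principal directions: changing the vote on one comment from pass (0) to agree (1) shifts the projected position by that comment's loading on each axis. The sketch below checks this on toy data. Whether the report's arrows are raw loadings or rescaled for display is not stated, so treat this as an assumption; `fit_pca` and `project` are hypothetical names.

```python
import numpy as np


def fit_pca(matrix, n_components=2):
    """Return the column means and top principal directions of a complete matrix."""
    mean = matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(matrix - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(votes, mean, components):
    """Map one participant's vote vector to opinion-space coordinates."""
    return (votes - mean) @ components.T


toy = np.array([
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
], dtype=float)
mean, components = fit_pca(toy)
arrows = components.T  # arrows[j] is the 2D arrow for comment j

# A participant who passes on everything, versus one who agrees with comment 0.
passed = np.zeros(toy.shape[1])
agreed = passed.copy()
agreed[0] = 1.0
shift = project(agreed, mean, components) - project(passed, mean, components)
```

Here `shift` equals `arrows[0]`, which is the sense in which an arrow shows where agreeing with a comment pushes a participant.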

[Figure: participant projection with comment arrows]

The comments most strongly correlated with position along the X-axis:

[Figure: comments most strongly correlated with the X-axis]

The comments most strongly correlated with position along the Y-axis:

[Figure: comments most strongly correlated with the Y-axis]

The most agreed on comments:

[Figure: most agreed on comments]